AIbase
# Multimodal framework

Data2vec Audio Base 960h
Apache-2.0
Data2Vec is a general self-supervised learning framework applicable to speech, vision, and language processing. This model is a speech recognition model pre-trained and fine-tuned on 960 hours of LibriSpeech audio data.
Speech Recognition, Transformers, English
facebook
10.61k
12
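
The listing above describes a fine-tuned speech recognition model, so a short usage sketch may help. The following is a minimal example, not an official recipe: it assumes the model is published on the Hugging Face Hub as `facebook/data2vec-audio-base-960h`, that it expects 16 kHz mono audio, and that the `transformers` and `torch` packages are installed. The function name `transcribe` is hypothetical and introduced here only for illustration.

```python
"""Sketch: transcribing 16 kHz mono audio with a Data2vec audio checkpoint."""

# Assumption: Hugging Face Hub id of the model listed above.
MODEL_ID = "facebook/data2vec-audio-base-960h"
# Assumption: the checkpoint expects 16 kHz mono input.
SAMPLE_RATE = 16_000


def transcribe(waveform, sample_rate=SAMPLE_RATE):
    """Return a greedy CTC transcription of a 1-D sequence of float samples.

    `transcribe` is a hypothetical helper, not part of any library.
    """
    # Validate first, so a wrong sample rate fails before any model download.
    if sample_rate != SAMPLE_RATE:
        raise ValueError(
            f"expected audio sampled at {SAMPLE_RATE} Hz, got {sample_rate} Hz"
        )

    # Imported lazily because these dependencies are heavy.
    import torch
    from transformers import Data2VecAudioForCTC, Wav2Vec2Processor

    # Downloads the processor and weights on first use.
    processor = Wav2Vec2Processor.from_pretrained(MODEL_ID)
    model = Data2VecAudioForCTC.from_pretrained(MODEL_ID)
    model.eval()

    # Convert the raw samples into a batched tensor of input values.
    inputs = processor(waveform, sampling_rate=sample_rate, return_tensors="pt")

    # Greedy decoding: pick the most likely token at every frame,
    # then let the processor collapse repeats and blanks into text.
    with torch.no_grad():
        logits = model(inputs.input_values).logits
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

To try it, pass a list or NumPy array of samples that were already resampled to 16 kHz, for example `transcribe(samples)`. Loading the model inside the function keeps the sketch short; a real application would likely load it once and reuse it.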